Objectives: Congenital muscular torticollis, a common disorder caused by shortening of the sternocleidomastoid muscle in
infants, responds well to physical therapy when treated early. If physical therapy is unsuccessful, surgery is
required. In this study, we developed a support vector regression model for congenital muscular torticollis to investigate the
prognosis of physical therapy in infants. Methods: Fifty-nine infants with congenital muscular torticollis received
physical therapy until the degree of neck tilt was less than 5°. After treatment, the mass diameter was reevaluated. Based on
these data, a support vector regression model was applied to predict the prognosis. Results: 10-, 20-, and 50-fold
cross-tabulation analyses were conducted for the proposed support vector regression model and for a conventional multiple
regression method based on least squares. The proposed method was robust and enabled effective
analysis of even a small amount of data containing outliers. Conclusions: The developed support vector regression model is
an effective prognostic tool for infants with congenital muscular torticollis who receive physical therapy.
I. Introduction

Torticollis is a condition in which the head is tilted to one side and the
chin is turned to the opposite side. Congenital muscular torticollis
(CMT) is a common disorder caused by shortening of the
sternocleidomastoid muscle in infants [1-4]. In general,
CMT appears at birth or within the first two months of life, and it
responds well to physical therapy. If the condition does not
resolve with early treatment, infants with CMT will be
unable to move their heads properly, and surgical treatment [5]
or Botox injection [6,7] will be needed to correct
the shortened muscle. Early diagnosis and treatment
are therefore extremely important for CMT [2,3]. In most cases, CMT
is detected at an early stage by the parents, and physical therapy
is performed regularly. However, the procedure of physical
therapy is not systematically organized, and research on
the prognosis of CMT according to treatment is still
inadequate [8,9].

Recently, a prediction model based on conventional
multiple regression was proposed to analyze the prognosis of CMT
according to treatment [9]. However, the
performance of conventional statistical models depends
on the quality and amount of the training data, and the shape of
a model constructed by conventional methods is
strongly influenced by outliers, because these
methods are based on empirical risk minimization (ERM)
[10,11]. In data mining, it is therefore recommended to
perform regression on a large amount of training data after
eliminating low-quality data such as outliers as far as possible
[11,12]. Nevertheless, collecting a large amount of high-quality
data through observation and experience is difficult,
especially for clinical data. Moreover, outliers in
clinical data are regarded as objects that carry important
information. Thus, a robust method that can analyze
clinical data effectively is needed.

To meet this need, we propose a support vector
regression (SVR)-based method that uses the degree of
neck tilt, face symmetry, initial mass diameter, age, and
treatment duration as independent variables to predict
the change of mass diameter after treatment and to
overcome the drawbacks of conventional methods. SVR is
based not on ERM but on structural risk minimization
(SRM), and it performs data analysis in a high-dimensional
feature space that is mapped by a kernel function such as a
polynomial, Gaussian radial basis function (RBF), exponential
radial basis function, or spline kernel [13-16]. SVR is
therefore able to minimize the generalization error effectively and
to reduce the influence of the amount and quality of the
collected data [10]. We show the effectiveness of the proposed
SVR-based prediction model for CMT through experiments
on data from fifty-nine infants with CMT and discuss the results.

II. Methods 

1. Object of study and data collection

In this study, data were collected from fifty-nine infants with CMT who visited
D medical center in Daegu from April 2003 to
December 2008. Infants seen during the same period who had
neurologic problems, congenital malformations of the cervical
vertebrae, or ocular torticollis were excluded from this
study. The data were categorized according to sex, age, initial and
final mass diameter, treatment duration, degree of neck tilt,
face symmetry, occipital symmetry, and treatment method,
as summarized in Table 1.

Physical therapy was performed twice a week by three
therapists, each with more than five years of experience in treating CMT,
and was finished when the degree of neck tilt
was less than five degrees and the asymmetric
form had disappeared. The duration of physical therapy
and the change of mass diameter after treatment were 90.66 ±
37.48 days and 6.36 ± 2.56 mm, respectively.

2. SVR-based prediction model 

Regression is a statistical method that is widely used to
predict and analyze the relationship between a dependent
variable and one or more independent variables.
Many kinds of regression methods have been proposed, and
they can be classified as simple regression or multiple
regression according to the number of independent
variables. A regression model can be represented as a function f
that relates the independent variables to the dependent variable Y:

$$Y \approx f(X, \beta) \tag{1}$$

where X denotes the independent variables and β denotes the unknown
parameters.

Table 1. Categories of data and statistical information

| Categories | | No. (%) or mean ± standard deviation |
|---|---|---|
| Sex | Male | 31 (52.5) |
| | Female | 28 (47.5) |
| Age (day) | | 34.73 ± 27.94 |
| Initial mass diameter (mm) | | 12.41 ± 4.55 |
| Final mass diameter (mm) | | 6.05 ± 3.94 |
| Treatment duration (day) | | 90.66 ± 37.48 |
| Degree of neck tilt (°) | 5-15 | 26 (44.1) |
| | Over 15 | 33 (55.9) |
| Face | Symmetry | 22 (37.3) |
| | Asymmetry | 37 (62.7) |
| Occipital | Symmetry | 15 (25.4) |
| | Asymmetry | 44 (74.6) |
| Treatment | Postural training and education | 21 (35.6) |
| | Manual stretching and education | 38 (64.4) |

To estimate an appropriate regression model, it is
recommended to perform regression on a large amount of data after
eliminating outliers by using standardized or
studentized residuals, because conventional regression
models are strongly influenced by the amount and quality of data, as
shown in Figure 1.
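The sensitivity of least squares to a single outlier can be illustrated with a minimal sketch. The data below are hypothetical values chosen for illustration and are not taken from the study; `fit_line` is an assumed helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data lying exactly on the line y = x (not study data).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 2, 3, 4, 5, 6, 7, 8]
clean_slope, clean_intercept = fit_line(xs, ys)

# Add one outlier at (9, 0) and fit again.
outlier_slope, outlier_intercept = fit_line(xs + [9], ys + [0])

print("without outlier:", clean_slope, clean_intercept)
print("with outlier:   ", outlier_slope, outlier_intercept)
```

In this toy case a single added point pulls the fitted slope from 1 down to about 0.4, which is the kind of distortion that motivates a more robust method.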

However, collecting a large amount of high-quality data is
difficult, especially in clinical cases, and an outlier is regarded
as an important object that contains useful information.
In this study, we therefore propose an SVR-based method that
can effectively analyze a small amount of clinical data containing
outliers, in order to overcome the drawbacks of
conventional statistical analysis methods.

SVR constructs a regression model by minimizing the
generalization error based on SRM, whereas other statistical
methods are based on ERM [14]. Moreover, SVR performs data
analysis in a high-dimensional feature space that is mapped
by a kernel function such as a polynomial or RBF kernel.
SVR can therefore deal with a small amount of data containing
outliers.

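Suppose we are given training data $\{(x_1, y_1), \ldots, (x_N, y_N)\} \subset \mathcal{X} \times \mathbb{R}$,
where N denotes the number of data, x and y denote the input vector and the
output, respectively, and $\mathcal{X}$ denotes the input data space. The
goal of SVR is to find a function f that has the minimum ω and deviates by at most
ε from each target $y_i$:

$$f(x) = \langle \omega, x \rangle + b \quad \text{with } \omega \in \mathcal{X},\ b \in \mathbb{R} \tag{2}$$

where $\langle \cdot, \cdot \rangle$ denotes the dot product in $\mathcal{X}$.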

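The convex optimization problem of finding the minimum ω can be
written as follows.

$$
\begin{aligned}
\text{minimize} \quad & \frac{1}{2}\|\omega\|^2 \\
\text{subject to} \quad & y_i - \langle \omega, x_i \rangle - b \le \varepsilon \\
& \langle \omega, x_i \rangle + b - y_i \le \varepsilon
\end{aligned}
\tag{3}
$$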



Figure 1. Difference of regression according to outlier. Horizontal axis: input; legend: data, regression result without outlier, regression result with outlier.



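However, with Equation (3) it is impossible to handle
training data that lie outside the bound ε. To
overcome this, slack variables $\xi_i$ and $\xi_i^*$ are introduced, and Equation
(3) can be rewritten as follows.

$$
\begin{aligned}
\text{minimize} \quad & \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \\
\text{subject to} \quad & y_i - \langle \omega, x_i \rangle - b \le \varepsilon + \xi_i \\
& \langle \omega, x_i \rangle + b - y_i \le \varepsilon + \xi_i^* \\
& \xi_i,\ \xi_i^* \ge 0
\end{aligned}
\tag{4}
$$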



Here, the regularization constant C > 0 controls the model
complexity by determining how much error is tolerated
for training data that lie outside the bound ε.
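The roles of C and ε in the slack-variable formulation can be made concrete with a short sketch. It assumes a linear model f(x) = ⟨ω, x⟩ + b and uses the fact that, at the optimum, the two slack variables of a sample sum to max(0, |y − f(x)| − ε). The function names and the toy numbers are hypothetical.

```python
def eps_insensitive(residual, eps):
    """Loss that ignores residuals inside the eps tube."""
    return max(0.0, abs(residual) - eps)


def primal_objective(w, b, xs, ys, C, eps):
    """Value of 0.5 * ||w||^2 + C * (sum of slacks) for a linear model."""
    norm_sq = sum(wi * wi for wi in w)
    total_slack = 0.0
    for x, y in zip(xs, ys):
        fx = sum(wi * xi for wi, xi in zip(w, x)) + b
        total_slack += eps_insensitive(y - fx, eps)
    return 0.5 * norm_sq + C * total_slack


# Toy example (hypothetical values): only the third point leaves the tube.
xs = [[1.0], [2.0], [3.0]]
ys = [1.2, 2.0, 4.0]
objective = primal_objective(w=[1.0], b=0.0, xs=xs, ys=ys, C=10.0, eps=0.3)
print(objective)
```

Here the residuals are about 0.2, 0, and 1.0, so only the last sample contributes a slack of about 0.7; a larger C would penalize that sample more heavily, and a larger ε would shrink or remove its contribution.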

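By applying the Lagrange function to solve Equation (4) and a
kernel function to map the data into a high-dimensional space, the problem
can be rewritten as Equations (5) and (6) [13-15].

$$
\begin{aligned}
\text{maximize} \quad & -\frac{1}{2} \sum_{i,j=1}^{N} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, k(x_i, x_j) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^*) \\
\text{subject to} \quad & \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) = 0, \quad \alpha_i,\ \alpha_i^* \in [0, C]
\end{aligned}
\tag{5}
$$

$$f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^*)\, k(x_i, x) + b \tag{6}$$

where $k(x, x') := \langle \Phi(x), \Phi(x') \rangle$, Φ denotes the mapping
function, and k denotes the kernel function.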
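As a minimal sketch of how a kernel expansion such as Equation (6) is evaluated, the code below computes a prediction from a set of support vectors. It assumes the common Gaussian RBF form exp(−‖x − z‖² / (2σ²)) and the standard SVR expansion f(x) = Σ (α_i − α_i*) k(x_i, x) + b; the function names and numbers are hypothetical and no training is performed.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel, assuming the form exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_vectors, coeffs, b, sigma):
    """Kernel expansion: sum_i coeffs[i] * k(x_i, x) + b.

    coeffs[i] stands for (alpha_i - alpha_i*), obtained from a solver
    that is not shown here.
    """
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, coeffs)) + b


# Hypothetical support vectors and coefficients, for illustration only.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [2.0, -1.0]
prediction = svr_predict([0.0, 0.0], support_vectors, coeffs, b=0.5, sigma=2.0)
print(prediction)
```

Only samples with a nonzero coefficient contribute to the sum, which is why the number of support vectors determines how complex the fitted function can be.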

Typical kernel functions and their parameters are summarized
in Table 2.

The performance of SVR is closely related to the SVR
parameters and the kernel function. Examples for the SVR
parameters (C: regularization constant; ε: error bound) are shown
in Figure 2.

Figures 2A and 2B show that the SVR parameter C is related to the
complexity of the model: the complexity of the regression
model increases as C increases. We could therefore
build a model with minimum training error by
selecting a higher value of C. However, the testing error
increases with a higher value of C because of the overfitting
problem.

Figures 2C and 2D show the influence of the SVR
parameter ε, which controls model complexity in a way similar to C.
The difference is that C controls complexity by
adjusting the sensitivity to errors on the training data, whereas ε controls
complexity by adjusting the number of support vectors. As
shown in Figures 2C and 2D, the number of support vectors
decreases and the regression model becomes simpler as ε
increases.

Table 2. Typical kernel function

| Type of kernel | Kernel function | Kernel parameter |
|---|---|---|
| Polynomial | | d |
| Gaussian radial basis function | | σ |
| Multilayer perceptron | | μ₀, μ₁ |
| Splines | $k(x, y) = 1 + xy + xy\min(x, y) - \frac{x + y}{2}(\min(x, y))^2 + \frac{1}{3}(\min(x, y))^3$ | |

Figure 2, panels A and C. Panel A: C: 100; Bound: 0.3; Kernel parameter: 3. Panel C: C: 500; Bound: 1; Kernel parameter: 3. Horizontal axis: input; legend: test data, training data, support vector, regression result, bound line (ε).

Therefore, selecting optimal parameters is important for
constructing a proper regression model based on SVR. In this study,
we tested several conditions (C: 500 and 1,000; ε:
0.001 and 0.005; Gaussian RBF kernel parameter: 2 and
4). For each condition, we constructed an SVR
prediction model to analyze the change of mass diameter in
CMT.
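The tested conditions (C: 500 and 1,000; ε: 0.001 and 0.005; Gaussian RBF kernel parameter: 2 and 4) form a small grid of eight combinations. A minimal sketch of enumerating that grid follows; the variable names are assumptions made for illustration.

```python
from itertools import product

# Parameter values reported in the text.
C_VALUES = [500, 1000]
EPSILON_VALUES = [0.001, 0.005]
SIGMA_VALUES = [2, 4]

# Every combination of the three parameters (2 x 2 x 2 = 8 conditions).
conditions = [
    {"C": c, "epsilon": eps, "sigma": sigma}
    for c, eps, sigma in product(C_VALUES, EPSILON_VALUES, SIGMA_VALUES)
]

for condition in conditions:
    print(condition)
```

Each dictionary corresponds to one model that would be trained and evaluated; the grid grows multiplicatively with every added parameter value.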

To analyze the change of mass diameter in CMT, we
collected data from fifty-nine infants with CMT. From these
data, we selected independent variables based on the t-test and
Pearson's correlation coefficient. After selecting the independent
variables, we constructed an SVR prediction model using the
Gaussian RBF kernel. The algorithm of this study is
summarized as follows.

Step 1] Collect data of infants with CMT.

Step 2] Select independent variables using the t-test and Pearson's
correlation coefficient.

Step 3] Construct the SVR prediction model according to the SVR
parameters.

Step 4] Evaluate the performance of the SVR model according
to the parameters based on root mean square error
(RMSE).

Figure 2. Example of support vector regression with Gaussian radial basis function. Panel B: C: 500; Bound: 0.3; Kernel parameter: 3. Panel D: C: 500; Bound: 3; Kernel parameter: 3. Horizontal axis: input; legend: test data, training data, support vector.
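The evaluation criterion in Step 4, the root mean square error, can be sketched as follows. The function name and the sample values are hypothetical and serve only to show the computation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical changes of mass diameter (mm): observed vs. predicted.
observed = [6.0, 4.5, 8.0]
predicted = [5.0, 4.5, 10.0]
error = rmse(observed, predicted)
print(error)
```

Because the errors are squared before averaging, a single badly predicted sample raises the RMSE more than several small misses do.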

The architecture of SVR is shown in Figure 3. 

III. Results 

Figure 3. Architecture of support vector regression. Layers shown: input layer, hidden layer (mapping, dot product), weights, output layer.

The change of diameter was used as the dependent variable, and the
t-test and Pearson's correlation coefficient were used, according
to data type, to select the optimal independent
variables. The results of the t-test and Pearson's correlation
coefficient are summarized in Tables 3 and 4.
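For readers who want to check the t statistics against the group summaries, the sketch below computes a two-sample t statistic from the group sizes, means, and standard deviations reported in Table 3. It assumes the equal-variance (pooled) form of the test, which the text does not state explicitly; `pooled_t` is a hypothetical helper name.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error


# Group summaries for change of diameter (mm), as reported in Table 3.
t_neck = pooled_t(26, 5.27, 2.27, 33, 7.21, 2.47)   # neck tilt: 5-15 vs. over 15
t_face = pooled_t(22, 5.59, 2.19, 37, 6.81, 2.67)   # face: symmetry vs. asymmetry

print(t_neck, t_face)
```

Under this assumption the values come out close to the reported −3.105 and −1.806; small differences are expected because the inputs are rounded summary statistics.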

In general, the p-value threshold is set at 0.05. However, the data
set used in this study is neither large nor free of
outliers. We therefore set the p-value threshold at 0.1 to consider
as many cases as possible. Based on this threshold, the degree of neck
tilt, face symmetry, initial mass diameter, age, and treatment
duration were selected as independent variables that
are related to the change of mass diameter, according to the
results of the statistical hypothesis tests shown in Tables 3 and 4.
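The selection rule can be written out as a short sketch that applies the 0.1 threshold to the p-values reported in Tables 3 and 4. The dictionary keys and variable names are assumptions made for illustration.

```python
# p-values against change of diameter, as reported in Tables 3 and 4.
P_VALUES = {
    "degree of neck tilt": 0.003,
    "face symmetry": 0.076,
    "occipital symmetry": 0.279,
    "treatment method": 0.224,
    "initial mass diameter": 0.000,
    "age": 0.068,
    "treatment duration": 0.018,
}

THRESHOLD = 0.1  # relaxed from the usual 0.05, as described in the text

# Keep every candidate whose p-value falls below the threshold.
selected = [name for name, p in P_VALUES.items() if p < THRESHOLD]
print(selected)
```

With the usual 0.05 threshold, face symmetry (0.076) and age (0.068) would be dropped, which is why the relaxed threshold matters for this data set.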

Using the selected independent variables, we constructed
SVR models with the Gaussian RBF kernel
for each parameter setting. To show the effectiveness of the
proposed SVR model, we performed 10-, 20-, and 50-fold
cross-tabulation analyses (training: 15 [25%]; test: 44
[75%]). In addition, the RMSEs of the proposed SVR-based
method and of the conventional multiple regression method using
least squares were compared. The results are
summarized in Table 5.
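The text does not spell out how the splits were generated. One possible reading, shown here purely as an assumption, is that each "k-fold" condition repeats a random split of the 59 infants into 15 training and 44 test samples k times. The helper name `repeated_splits` and the seed are hypothetical.

```python
import random


def repeated_splits(n_samples, n_train, n_repeats, seed=0):
    """Return n_repeats random (train_indices, test_indices) pairs.

    Assumption: each repetition is an independent random split; this is
    one interpretation of the evaluation protocol, not a confirmed one.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    splits = []
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        train = sorted(shuffled[:n_train])
        test = sorted(shuffled[n_train:])
        splits.append((train, test))
    return splits


# 59 infants, 15 for training and 44 for testing, repeated 10 times.
splits = repeated_splits(n_samples=59, n_train=15, n_repeats=10)
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Under this reading, the RMSE of each condition would be averaged over the repetitions, and the same splits could be reused for both the SVR model and the least-squares baseline so that the comparison is paired.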

Table 3. The results of t-test according to parameters and change of diameter

| Features | State | No. | Change of diameter (mm), average ± SD | t | p-value |
|---|---|---|---|---|---|
| Degree of neck tilt | 5-15 | 26 | 5.27 ± 2.27 | -3.105 | 0.003 |
| | Over 15 | 33 | 7.21 ± 2.47 | | |
| Face | Symmetry | 22 | 5.59 ± 2.19 | -1.806 | 0.076 |
| | Asymmetry | 37 | 6.81 ± 2.67 | | |
| Occipital | Symmetry | 15 | 5.73 ± 2.91 | -1.093 | 0.279 |
| | Asymmetry | 44 | 6.57 ± 2.42 | | |
| Treatment | Postural training and education | 21 | 6.90 ± 1.86 | 1.231 | 0.224 |
| | Manual stretching and education | 38 | 6.05 ± 2.84 | | |

SD: standard deviation.

Table 4. The results of Pearson's correlation coefficient

| | Change of diameter | Initial mass diameter | Age | Treatment duration |
|---|---|---|---|---|
| Change of diameter | 1.000 | | | |
| Initial mass diameter | 0.506 (0.000) | 1.000 | | |
| Age | -0.240 (0.068) | 0.028 (0.833) | 1.000 | |
| Treatment duration | 0.307 (0.018) | 0.182 (0.169) | -0.044 (0.741) | 1.000 |

Values are presented as r (p-value).

Table 5. Average and standard deviation of RMSE results according to methods

| C | Bound (ε) | Kernel parameter (σ) | SVR 10-fold | SVR 20-fold | SVR 50-fold | Conventional 10-fold | Conventional 20-fold | Conventional 50-fold |
|---|---|---|---|---|---|---|---|---|
| 500 | 0.001 | 2 | 2.67 | 2.68 | 2.67 | 2.90 | 2.81 | 3.18 |
| 500 | 0.001 | 4 | 2.64 | 2.75 | 2.70 | 2.65 | 4.30 | 3.25 |
| 500 | 0.005 | 2 | 2.61 | 2.61 | 2.68 | 3.86 | 4.54 | 3.94 |
| 500 | 0.005 | 4 | 2.77 | 2.60 | 2.69 | 2.92 | 3.11 | 2.93 |
| 1,000 | 0.001 | 2 | 2.59 | 2.67 | 2.67 | 3.27 | 2.80 | 3.41 |
| 1,000 | 0.001 | 4 | 2.59 | 2.69 | 2.66 | 2.54 | 3.69 | 2.77 |
| 1,000 | 0.005 | 2 | 2.66 | 2.57 | 2.67 | 2.75 | 2.68 | 2.81 |
| 1,000 | 0.005 | 4 | 2.74 | 2.69 | 2.74 | 5.60 | 2.94 | 3.48 |
| Average ± standard deviation | | | 2.66 ± 0.07 | 2.66 ± 0.06 | 2.69 ± 0.03 | 3.31 ± 1.01 | 3.36 ± 0.73 | 3.22 ± 0.39 |

RMSE: root mean square error, SVR: support vector regression, C: regularization constant, Bound (ε): error bound, kernel parameter (σ): standard deviation of Gaussian radial basis function kernel. "SVR" columns are the SVR results according to parameters; "Conventional" columns are the conventional multi-regression results.

Table 5 shows the RMSE results of the proposed SVR prediction
model and the conventional multiple regression model under the
10-, 20-, and 50-fold cross-tabulation conditions.
The proposed method yielded RMSEs of 2.66 ± 0.07, 2.66
± 0.06, and 2.69 ± 0.03 for the 10-, 20-, and 50-fold
conditions, respectively, whereas the conventional
method yielded 3.31 ± 1.01, 3.36 ± 0.73, and 3.22 ± 0.39
for the same conditions.

Based on the overall comparison, the proposed
method provides more robust and stable results than the
conventional method. In particular, in the 10-fold results, the
standard deviations of the proposed method and the
conventional multiple regression method are 0.07 and 1.01,
respectively. That is, the proposed SVR method provides proper
results while minimizing the influence of the amount and quality of
the collected data.
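The summary row for the 10-fold columns can be recomputed from the eight per-condition RMSE values in Table 5. The sketch below assumes the reported spread is the sample standard deviation (n − 1 in the denominator), which is what `statistics.stdev` computes; the variable names are hypothetical.

```python
from statistics import mean, stdev

# 10-fold RMSE values for the eight parameter conditions in Table 5.
svr_10 = [2.67, 2.64, 2.61, 2.77, 2.59, 2.59, 2.66, 2.74]
conventional_10 = [2.90, 2.65, 3.86, 2.92, 3.27, 2.54, 2.75, 5.60]

# Mean and sample standard deviation, rounded as in the table.
svr_summary = (round(mean(svr_10), 2), round(stdev(svr_10), 2))
conventional_summary = (round(mean(conventional_10), 2),
                        round(stdev(conventional_10), 2))

print(svr_summary, conventional_summary)
```

Under that assumption the values round to 2.66 ± 0.07 for SVR and 3.31 ± 1.01 for the conventional method, matching the table. Most of the conventional method's spread comes from a few conditions with much higher RMSE, such as 5.60 and 3.86.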

The experiment shows that the proposed SVR
model predicts the change of diameter with a lower and more stable RMSE by reducing
the effects of the amount and quality of data, whereas the result of
the conventional regression method depends on the amount and
quality of data.

IV. Discussion 

Congenital muscular torticollis is a common disorder in which
the sternocleidomastoid muscle of infants is shortened by fibrosis.
Most infants with CMT are identified within
two weeks and receive physical therapy. However,
research analyzing the prognosis of physical therapy
for CMT is inadequate. In this study, we therefore
proposed an SVR-based method to analyze the prognosis of
infants who receive physical therapy for CMT and to overcome
the drawbacks of conventional statistical analysis methods.

From the results in Section III, we confirmed that
the treatment method is not statistically significant, as
mentioned in reference [17]. Moreover, we found
that the degree of neck tilt, face symmetry, initial mass diameter, age, and
treatment duration are statistically significant at a p-value
threshold of 0.1. Based on the selected independent variables, we
showed that the proposed SVR-based method provides stable
and appropriate results in the 10-, 20-, and 50-fold
cross-tabulation analyses across SVR parameters, whereas the
conventional method does not.

In medical informatics, which relies on clinical data, it is
difficult to build robust diagnosis and prediction systems
based on large amounts of high-quality data, because the
history of electronic medical record (EMR) systems is not long
enough to have accumulated a large amount of varied clinical data in
the domestic setting. Diagnosis and prediction systems
based on conventional statistical methods, which depend
on the amount and quality of data, therefore have drawbacks. In contrast,
the SVR method, which is based on SRM, can provide robust
results, as shown in Section III. Thus, in medical informatics,
we expect that SVR-based systems can be applied to
various other clinical cases to build diagnosis and prediction
systems.

However, the selection of parameters is important because
the parameters directly control the complexity
of the regression model. If the constructed model is
fitted too tightly to the training data by tuning the parameters,
the generalization error increases. Conversely, if the
constructed model is too simple, it cannot
represent the characteristics of the data. Various studies have
proposed ways to select optimal parameters based on the
given data and expert knowledge, especially in engineering
applications. Nonetheless, these algorithms are mostly based
on exhaustive search methods such as grid search, so
the computational cost grows as the
complexity of the given feature set increases. In the case of
clinical data, the feature set is more complex than the feature
sets of other applications. To overcome this limitation,
research on feature reduction and optimal parameter
selection is required. We leave these issues for further
study.